What implementation and translation teach us: the case of semantic similarity measures in wordnets
Authors
Abstract
WordNet::Similarity is an important tool used in many applications. It has long been available as a toolkit for English and has frequently been tested on English gold standards. In this paper, we describe how we constructed a Dutch gold standard that matches the English gold standard as closely as possible. We also re-implemented the WordNet::Similarity package so that it can handle any wordnet specified in Wordnet-LMF format, independent of language. This makes it possible to compare the similarity measures across wordnets and across languages. It also provides a new way of comparing wordnet structures across languages through one of their core aspects: the synonymy and hyponymy structure. We report on the comparison between the Dutch and English wordnets and gold standards. This comparison shows that the gold standards, and therefore the intuitions of English and Dutch native speakers, appear to be highly compatible. We also show that our package generates results for English similar to those reported earlier, and good results for Dutch. Contrary to our expectations, some measures even perform better for Dutch than for English.
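To make the evaluation idea concrete, the following is a minimal sketch and not the authors' package. It computes a path-based similarity (one over one plus the shortest path length) on a tiny hypernym graph and compares the scores with human ratings through Spearman rank correlation. The graph, the word pairs, the ratings, and all function names are invented for illustration. The use of Spearman correlation is an assumption, since the abstract does not say which agreement statistic was used.

```python
from collections import deque

# Toy hypernym graph (child -> parents). Invented for illustration only;
# a real wordnet would be loaded from a Wordnet-LMF file instead.
HYPERNYMS = {
    "dog": ["canine"],
    "wolf": ["canine"],
    "canine": ["mammal"],
    "cat": ["feline"],
    "feline": ["mammal"],
    "mammal": ["animal"],
    "bird": ["animal"],
    "animal": [],
}


def _neighbors(graph):
    """Build an undirected adjacency map from the hypernym links."""
    adj = {node: set() for node in graph}
    for child, parents in graph.items():
        for parent in parents:
            adj[child].add(parent)
            adj.setdefault(parent, set()).add(child)
    return adj


def path_similarity(graph, a, b):
    """Return 1 / (1 + shortest path length in edges), or 0.0 if unconnected."""
    adj = _neighbors(graph)
    seen = {a}
    queue = deque([(a, 0)])
    while queue:  # breadth-first search finds the shortest path first
        node, dist = queue.popleft()
        if node == b:
            return 1.0 / (1 + dist)
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return 0.0


def _ranks(values):
    """Assign 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = _ranks(xs), _ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one side has no variation
    return cov / (vx * vy) ** 0.5


# Hypothetical word pairs and made-up human ratings (not a real gold standard).
pairs = [("dog", "wolf"), ("dog", "cat"), ("dog", "animal")]
human = [3.5, 3.0, 2.0]
scores = [path_similarity(HYPERNYMS, a, b) for a, b in pairs]
rho = spearman(scores, human)
```

Because the measure depends only on the graph, the same functions could in principle be run on an English and a Dutch hierarchy and the two correlations compared, which is the kind of cross-language comparison the abstract describes.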
Similar resources
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing natural-language answers using computational methods and machine learning algorithms. The development of large-scale smart education systems on the one hand, and the importance of assessment as a key factor in the learning process together with the challenges it faces on the other, have significantly increased the need for ...
Translation Invariant Approach for Measuring Similarity of Signals
In many signal processing applications, an appropriate measure for comparing two signals plays a fundamental role both in implementing the algorithm and in evaluating its performance. Several techniques have been introduced in the literature as similarity measures. However, the existing measures are often either impractical for some applications or they have unsatisfactory results in some other applicati...
English-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism, which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them”, poses a major challenge to the dissemination and publication of knowledge. Plagiarism has been divided into four categories: direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...